week 7: multilevel models

multilevel adventures

mlm

Often, there are opportunities to cluster your observations – repeated measures, group membership, hierarchies, even different measures for the same participant. Whenever you can cluster, you should!

  • Aggregation is bad
  • Regressions within regressions (i.e., coefficients as outcomes)
  • Questions at different levels
  • Variance decomposition
  • Learning from other data through pooling/shrinkage
  • Parameters that depend on parameters

today’s topics

  • More than one clustering variable
  • multilevel slopes

more than one type of cluster

McElreath doesn’t cover this in his video lecture, but this is from the textbook and worth discussing.

Data from Silk et al. (2005)

data(chimpanzees, package="rethinking")
d <- chimpanzees
rethinking::precis(d)
                   mean         sd  5.5%  94.5%    histogram
actor         4.0000000  2.0019871 1.000  7.000 ▇▇▁▇▁▇▁▇▁▇▁▇
recipient     5.0000000  2.0039801 2.000  8.000 ▇▇▁▇▁▇▁▇▁▇▁▇
condition     0.5000000  0.5004968 0.000  1.000   ▇▁▁▁▁▁▁▁▁▇
block         3.5000000  1.7095219 1.000  6.000   ▇▇▁▇▁▇▁▇▁▇
trial        36.5000000 20.8032533 4.665 68.335     ▇▇▇▇▇▇▇▁
prosoc_left   0.5000000  0.5004968 0.000  1.000   ▇▁▁▁▁▁▁▁▁▇
chose_prosoc  0.5674603  0.4959204 0.000  1.000   ▅▁▁▁▁▁▁▁▁▇
pulled_left   0.5793651  0.4941515 0.000  1.000   ▅▁▁▁▁▁▁▁▁▇
unique(d$actor)
[1] 1 2 3 4 5 6 7
unique(d$block)
[1] 1 2 3 4 5 6
unique(d$prosoc_left)
[1] 0 1
unique(d$condition)
[1] 0 1

We could model the interaction between condition (presence/absence of another animal) and option (which side is prosocial), but it is more difficult to assign sensible priors to interaction effects. Another option, because we’re working with categorical variables, is to turn our 2x2 into one variable with 4 levels.

d$treatment <- factor(1 + d$prosoc_left + 2*d$condition)
d %>% count(treatment, prosoc_left, condition)
  treatment prosoc_left condition   n
1         1           0         0 126
2         2           1         0 126
3         3           0         1 126
4         4           1         1 126
t_labels = c("r/n", "l/n", "r/p", "l/p")

In this experiment, each pull is within a cluster of pulls belonging to an individual chimpanzee. But each pull is also within an experimental block, which represents a collection of observations that happened on the same day. So each observed pull belongs to both an actor (1 to 7) and a block (1 to 6). There may be unique intercepts for each actor as well as for each block.
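Because every actor was observed in every block, these two cluster types are crossed rather than nested, and each needs its own set of varying intercepts. A quick check (a sketch, assuming `d` is the chimpanzees data loaded above):

```r
# actor and block are crossed, not nested: every actor should appear
# in every block, so no actor-block cell of this table is empty.
xtabs(~ actor + block, data = d)
```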

Mathematical model:

\[\begin{align*} L_i &\sim \text{Binomial}(1, p_i) \\ \text{logit}(p_i) &= \bar{\alpha} + \alpha_{\text{ACTOR[i]}} + \bar{\gamma} + \gamma_{\text{BLOCK[i]}} + \beta_{\text{TREATMENT[i]}} \\ \beta_j &\sim \text{Normal}(0, 0.5) \text{ , for }j=1..4\\ \alpha_j &\sim \text{Normal}(0, \sigma_{\alpha}) \text{ , for }j=1..7\\ \gamma_j &\sim \text{Normal}(0, \sigma_{\gamma}) \text{ , for }j=1..6\\ \bar{\alpha} &\sim \text{Normal}(0, 1.5) \\ \bar{\gamma} &\sim \text{Normal}(0, 1.5) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\gamma} &\sim \text{Exponential}(1) \\ \end{align*}\]

m1 <- 
  brm(
    family = bernoulli,
    data = d, 
    pulled_left ~ 0 + treatment + (1 | actor) + (1 | block), 
    prior = c(prior(normal(0, 1.5), class = b),
              prior(exponential(1), class = sd, group = actor),
              prior(exponential(1), class = sd, group = block)),
    chains=4, cores=4, iter=2000, warmup=1000,
    seed = 1,
    file = here("files/models/71.1")
  )
m1
 Family: bernoulli 
  Links: mu = logit 
Formula: pulled_left ~ 0 + treatment + (1 | actor) + (1 | block) 
   Data: d (Number of observations: 504) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~actor (Number of levels: 7) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     1.96      0.63     1.06     3.45 1.00     1250     1660

~block (Number of levels: 6) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.20      0.17     0.01     0.63 1.00     1351     1690

Regression Coefficients:
           Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
treatment1     0.25      0.57    -0.90     1.38 1.00     1083     1577
treatment2     0.85      0.57    -0.31     1.94 1.00     1044     1743
treatment3    -0.16      0.57    -1.33     0.94 1.00     1062     1573
treatment4     0.72      0.57    -0.46     1.81 1.00     1022     1565

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(m1)
                          Estimate Est.Error          Q2.5        Q97.5
b_treatment1            0.25346086 0.5690543 -8.956685e-01    1.3752420
b_treatment2            0.85408515 0.5716742 -3.133519e-01    1.9394030
b_treatment3           -0.16255802 0.5734419 -1.331065e+00    0.9439413
b_treatment4            0.72272758 0.5735628 -4.574810e-01    1.8120448
sd_actor__Intercept     1.95765160 0.6271925  1.058292e+00    3.4481233
sd_block__Intercept     0.20429722 0.1714939  7.619742e-03    0.6298440
r_actor[1,Intercept]   -0.76806254 0.5871299 -1.884859e+00    0.4127745
r_actor[2,Intercept]    4.16869772 1.2800422  2.187441e+00    7.2504878
r_actor[3,Intercept]   -1.07340338 0.5831362 -2.164617e+00    0.1178102
r_actor[4,Intercept]   -1.06777362 0.5979555 -2.198872e+00    0.1664252
r_actor[5,Intercept]   -0.76831595 0.5940808 -1.901701e+00    0.4779332
r_actor[6,Intercept]    0.18036694 0.5931503 -9.723677e-01    1.3304202
r_actor[7,Intercept]    1.71737307 0.6527872  4.933997e-01    3.0887395
r_block[1,Intercept]   -0.15780279 0.2170111 -6.890853e-01    0.1378163
r_block[2,Intercept]    0.03741492 0.1817171 -3.388320e-01    0.4520013
r_block[3,Intercept]    0.05159167 0.1837081 -2.778782e-01    0.4987016
r_block[4,Intercept]    0.01488767 0.1817779 -3.471849e-01    0.4117812
r_block[5,Intercept]   -0.02724225 0.1824159 -4.304912e-01    0.3526096
r_block[6,Intercept]    0.10603610 0.1960600 -2.018372e-01    0.5857896
lprior                 -8.04858139 0.8743820 -1.025541e+01   -6.8249218
lp__                 -288.90426240 3.8659615 -2.973491e+02 -282.5082745
m1 %>% 
  mcmc_plot(variable = c("^r_", "^b_", "^sd_"), regex = T) +
  theme(axis.text.y = element_text(hjust = 0))

Zooming in on just the actor and block effects. (Remember, these are each cluster’s offset from the population-level effects, not absolute values.)

m1 %>% 
  mcmc_plot(variable = c("^r_"), regex = T) +
  theme(axis.text.y = element_text(hjust = 0))
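To turn one of these offsets into something interpretable, add it to the relevant population-level effect and pass the sum through the inverse logit. A sketch for actor 2 under treatment 1 (assuming `m1` from above; the actor and treatment chosen here are arbitrary):

```r
# p(pull left) for actor 2, treatment 1: population-level effect plus
# the actor offset, pushed through the inverse logit (plogis).
as_draws_df(m1) %>%
  mutate(p = plogis(b_treatment1 + `r_actor[2,Intercept]`)) %>%
  tidybayes::mean_qi(p)
```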
Code
as_draws_df(m1) %>% 
  select(starts_with("sd")) %>% 
  pivot_longer(everything()) %>% 
  ggplot(aes(x = value, fill = name)) +
  geom_density(linewidth = 0, alpha = 3/4, adjust = 2/3, show.legend = F) +
  annotate(geom = "text", x = 0.67, y = 2, label = "block", color = "#5e8485") +
  annotate(geom = "text", x = 2.725, y = 0.5, label = "actor", color = "#0f393a") +
  scale_fill_manual(values = c("#0f393a", "#5e8485")) +
  scale_y_continuous(NULL, breaks = NULL) +
  ggtitle(expression(sigma["group"])) +
  coord_cartesian(xlim = c(0, 4))

Let’s look at the predictions by actor and block to confirm.

Code
d %>% distinct(actor, block, treatment) %>% 
  add_epred_draws(m1) %>% 
  mutate(treatment=factor(treatment,
                          levels=as.character(1:4),
                          labels=t_labels)) %>% 
  group_by(actor, treatment) %>% 
  mean_qi(.epred) %>% 
  ggplot( aes(x=treatment, y=.epred, group=1) ) +
  geom_point() +
  geom_line() +
  labs(x=NULL, y="p(pull left)", title="by actor") +
  facet_wrap(~actor)
Code
d %>% distinct(actor, block, treatment) %>% 
  add_epred_draws(m1) %>% 
  mutate(treatment=factor(treatment,
                          levels=as.character(1:4),
                          labels=t_labels)) %>% 
  group_by(block, treatment) %>% 
  mean_qi(.epred) %>% 
  ggplot( aes(x=treatment, y=.epred, group=1) ) +
  geom_point() +
  geom_line() +
  labs(x=NULL, y="p(pull left)", title="by block") +
  facet_wrap(~block)

slopes

Let’s start by simulating cafe data.

# ---- set population-level parameters -----
a <- 3.5       # average morning wait time
b <- (-1)      # average difference afternoon wait time
sigma_a <- 1   # std dev in intercepts
sigma_b <- 0.5 # std dev in slopes
rho <- (-0.7)  #correlation between intercepts and slopes

# ---- create vector of means ----
Mu <- c(a, b)

# ---- create matrix of variances and covariances ----
sigmas <- c(sigma_a,sigma_b) # standard deviations
Rho <- matrix( c(1,rho,rho,1) , nrow=2 ) # correlation matrix
# now matrix multiply to get covariance matrix
Sigma <- diag(sigmas) %*% Rho %*% diag(sigmas)

# ---- simulate intercepts and slopes -----
N_cafes = 20
library(MASS)
set.seed(5)
vary_effects <- mvrnorm( n=N_cafes, mu = Mu, Sigma=Sigma)
a_cafe <- vary_effects[, 1]
b_cafe <- vary_effects[, 2]

# ---- simulate observations -----

set.seed(22)
N_visits <- 10
afternoon <- rep(0:1,N_visits*N_cafes/2)
cafe_id <- rep( 1:N_cafes , each=N_visits )
mu <- a_cafe[cafe_id] + b_cafe[cafe_id]*afternoon
sigma <- 0.5 # std dev within cafes
wait <- rnorm( N_visits*N_cafes , mu , sigma )
d <- data.frame( cafe=cafe_id , afternoon=afternoon , wait=wait )
rethinking::precis(d)
               mean        sd     5.5%     94.5%      histogram
cafe      10.500000 5.7807513 2.000000 19.000000     ▇▇▇▇▇▇▇▇▇▇
afternoon  0.500000 0.5012547 0.000000  1.000000     ▇▁▁▁▁▁▁▁▁▇
wait       3.073405 1.1082252 1.477719  4.869957 ▁▁▂▅▇▇▅▇▃▃▁▁▁▁
d %>% filter(cafe==1)
   cafe afternoon     wait
1     1         0 3.967893
2     1         1 3.857198
3     1         0 4.727875
4     1         1 2.761013
5     1         0 4.119483
6     1         1 3.543652
7     1         0 4.190949
8     1         1 2.533223
9     1         0 4.124032
10    1         1 2.764887

a simulation note from RM

In this exercise, we are simulating data from a generative process and then analyzing that data with a model that reflects exactly the correct structure of that process. But in the real world, we’re never so lucky. Instead we are always forced to analyze data with a model that is MISSPECIFIED: the true data-generating process is different from the model. Simulation can, however, be used to explore misspecification. Just simulate data from a process and then see how a number of models, none of which match the data-generating process exactly, perform. And always remember that Bayesian inference does not depend upon data-generating assumptions, such as the likelihood, being true. Non-Bayesian approaches may depend upon sampling distributions for their inferences, but this is not the case for a Bayesian model. In a Bayesian model, a likelihood is a prior for the data, and inference about parameters can be surprisingly insensitive to its details.
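As a minimal illustration of that last point (my own sketch, not from the lecture): simulate heavy-tailed data, fit a deliberately misspecified model that assumes Normal noise, and check whether the location is still recovered.

```r
set.seed(1)
y <- 3 + rt(200, df = 3)   # true process: heavy-tailed Student-t noise around 3
fit <- lm(y ~ 1)           # misspecified: assumes Normal noise
coef(fit)                  # the location should still land near 3
```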

Mathematical model:

likelihood function and linear model

\[\begin{align*} W_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{CAFE[i]} + \beta_{CAFE[i]}(\text{Afternoon}_i) \end{align*}\]

varying intercepts and slopes

\[\begin{align*} \begin{bmatrix} \alpha_{CAFE[i]} \\ \beta_{CAFE[i]} \end{bmatrix} &\sim \text{MVNormal}\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \\ \end{align*}\]

priors

\[\begin{align*} \alpha &\sim \text{Normal}(5,2) \\ \beta &\sim \text{Normal}(-1,0.5) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]

LKJ correlation prior

Code
# examples
rlkj_1 = rethinking::rlkjcorr(1e4, K=2, eta=1)
rlkj_2 = rethinking::rlkjcorr(1e4, K=2, eta=2)
rlkj_4 = rethinking::rlkjcorr(1e4, K=2, eta=4)
rlkj_6 = rethinking::rlkjcorr(1e4, K=2, eta=6)
data.frame(rlkj_1= rlkj_1[,1,2], 
           rlkj_2= rlkj_2[,1,2], 
           rlkj_4= rlkj_4[,1,2],
           rlkj_6= rlkj_6[,1,2]) %>% 
  ggplot() +
  geom_density(aes(x=rlkj_1, color = "1"), alpha=.3) +
  geom_density(aes(x=rlkj_2, color = "2"), alpha=.3) +
  geom_density(aes(x=rlkj_4, color = "4"), alpha=.3) +
  geom_density(aes(x=rlkj_6, color = "6"), alpha=.3) +
  labs(x="R", color="eta") +
  theme(legend.position = "top")
m2 <- brm(
  data = d,
  family = gaussian,
  wait ~ 1 + afternoon + (1 + afternoon | cafe),
  prior = c(
    prior( normal(5,2),    class=Intercept ), 
    prior( normal(-1, .5), class=b),
    prior( exponential(1), class=sd),
    prior( exponential(1), class=sigma),
    prior( lkj(2),         class=cor)
  ), 
  iter=2000, warmup=1000, chains=4, cores=4, seed=9,
  file=here("files/models/71.2")
)
posterior_summary(m2) %>% round(2)
                               Estimate Est.Error    Q2.5   Q97.5
b_Intercept                        3.66      0.23    3.21    4.11
b_afternoon                       -1.13      0.14   -1.41   -0.85
sd_cafe__Intercept                 0.97      0.17    0.71    1.37
sd_cafe__afternoon                 0.59      0.12    0.39    0.87
cor_cafe__Intercept__afternoon    -0.51      0.18   -0.80   -0.10
sigma                              0.47      0.03    0.42    0.53
Intercept                          3.10      0.20    2.71    3.50
r_cafe[1,Intercept]                0.55      0.29   -0.01    1.14
r_cafe[2,Intercept]               -1.50      0.30   -2.10   -0.90
r_cafe[3,Intercept]                0.70      0.30    0.12    1.30
r_cafe[4,Intercept]               -0.42      0.29   -0.98    0.15
r_cafe[5,Intercept]               -1.79      0.30   -2.36   -1.20
r_cafe[6,Intercept]                0.60      0.29    0.02    1.18
r_cafe[7,Intercept]               -0.05      0.29   -0.64    0.55
r_cafe[8,Intercept]                0.28      0.30   -0.31    0.86
r_cafe[9,Intercept]                0.32      0.29   -0.28    0.90
r_cafe[10,Intercept]              -0.10      0.29   -0.68    0.49
r_cafe[11,Intercept]              -1.74      0.29   -2.31   -1.17
r_cafe[12,Intercept]               0.18      0.29   -0.40    0.76
r_cafe[13,Intercept]               0.22      0.30   -0.37    0.82
r_cafe[14,Intercept]              -0.49      0.30   -1.08    0.09
r_cafe[15,Intercept]               0.79      0.30    0.22    1.39
r_cafe[16,Intercept]              -0.27      0.29   -0.86    0.32
r_cafe[17,Intercept]               0.55      0.30   -0.03    1.15
r_cafe[18,Intercept]               2.08      0.30    1.52    2.68
r_cafe[19,Intercept]              -0.42      0.30   -1.00    0.16
r_cafe[20,Intercept]               0.07      0.30   -0.53    0.65
r_cafe[1,afternoon]               -0.03      0.29   -0.60    0.53
r_cafe[2,afternoon]                0.23      0.30   -0.36    0.79
r_cafe[3,afternoon]               -0.80      0.30   -1.39   -0.24
r_cafe[4,afternoon]               -0.10      0.28   -0.65    0.43
r_cafe[5,afternoon]                1.00      0.30    0.42    1.57
r_cafe[6,afternoon]               -0.16      0.28   -0.72    0.38
r_cafe[7,afternoon]                0.10      0.28   -0.46    0.64
r_cafe[8,afternoon]               -0.50      0.29   -1.09    0.07
r_cafe[9,afternoon]               -0.17      0.28   -0.73    0.38
r_cafe[10,afternoon]               0.18      0.29   -0.39    0.75
r_cafe[11,afternoon]               0.70      0.29    0.15    1.27
r_cafe[12,afternoon]              -0.06      0.29   -0.63    0.50
r_cafe[13,afternoon]              -0.68      0.29   -1.25   -0.12
r_cafe[14,afternoon]               0.19      0.29   -0.36    0.78
r_cafe[15,afternoon]              -1.06      0.31   -1.65   -0.44
r_cafe[16,afternoon]               0.09      0.28   -0.47    0.64
r_cafe[17,afternoon]              -0.09      0.28   -0.64    0.47
r_cafe[18,afternoon]               0.10      0.30   -0.49    0.70
r_cafe[19,afternoon]               0.87      0.30    0.29    1.47
r_cafe[20,afternoon]               0.07      0.29   -0.49    0.66
lprior                            -5.06      0.44   -6.07   -4.36
lp__                            -197.20      7.16 -212.07 -184.27

Let’s get the slopes and intercepts for each cafe.

Code
intercepts = coef(m2)$cafe[ ,, "Intercept"]
slopes = coef(m2)$cafe[,, "afternoon"]
cafe_params = data.frame(
  cafe=1:20,
  intercepts=intercepts[, 1],
  slopes=slopes[, 1]
) 
cafe_params %>% round(3)
   cafe intercepts slopes
1     1      4.215 -1.156
2     2      2.158 -0.904
3     3      4.367 -1.928
4     4      3.243 -1.232
5     5      1.875 -0.135
6     6      4.260 -1.294
7     7      3.617 -1.027
8     8      3.945 -1.633
9     9      3.979 -1.305
10   10      3.563 -0.953
11   11      1.927 -0.430
12   12      3.841 -1.186
13   13      3.881 -1.809
14   14      3.175 -0.938
15   15      4.455 -2.192
16   16      3.390 -1.040
17   17      4.217 -1.219
18   18      5.748 -1.028
19   19      3.247 -0.260
20   20      3.729 -1.056
Code
cafe_params %>% 
  ggplot( aes(x=intercepts, y=slopes) ) +
  geom_point(size = 2) 
Code
cafe_params %>% 
  ggplot( aes(x=intercepts, y=slopes) ) +
  stat_ellipse() +
  geom_point(size = 2) 
Code
cafe_params %>% 
  ggplot( aes(x=intercepts, y=slopes) ) +
  mapply(function(level) {
    stat_ellipse(geom  = "polygon", type = "norm",
                 linewidth = 0, alpha = .1, fill = "#1c5253",
                 level = level)
    }, 
    # enter the levels here
    level = c(1:9 / 10, .99)) +
  geom_point(size = 2) 

More about stat_ellipse in the ggplot2 documentation.

exercise

Now use the slopes and intercepts to calculate the expected morning and afternoon wait times for each cafe. Plot these as a scatterplot. Bonus for ellipses.

Code
cafe_params %>% 
  mutate(
    morning = intercepts, 
    afternoon = intercepts + slopes
  ) %>% 
  ggplot( aes(x=morning, y=afternoon) ) +
  mapply(function(level) {
    stat_ellipse(geom  = "polygon", type = "norm",
                 linewidth = 0, alpha = .1, fill = "#1c5253",
                 level = level)
    }, 
    # enter the levels here
    level = c(1:9 / 10, .99)) +
  geom_point(size = 2)+
  labs(x="morning wait time",
       y="afternoon wait time")

What is the covariance of our intercepts and slopes?

Code
post = as_draws_df(m2)
rlkj_2 = rethinking::rlkjcorr(nrow(post), K=2, eta=2)

data.frame(prior= rlkj_2[,1,2],
           posterior = post$cor_cafe__Intercept__afternoon) %>% 
  ggplot() +
  geom_density(aes(x=prior, color = "prior"), alpha=.3) +
  geom_density(aes(x=posterior, color = "posterior"), alpha=.3) +
  scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
  labs(x="R") +
  theme(legend.position = "top")

predictors at different levels

Returning to our personality data from Tuesday.

data_path = "https://raw.githubusercontent.com/sjweston/uobayes/refs/heads/main/files/data/external_data/mlm.csv"
d <- read.csv(data_path)

d %>% count(wave)
  wave  n
1    1 91
2    2 88
3    3 35
4    4  5
d %>% count(group)
  group   n
1  CTRL  58
2    PD 161
Code
d %>% 
  ggplot(aes(x=week, y=con, group=id)) +
  geom_line(alpha=.3) +
  facet_wrap(~group)

week is a within-person variable (Level 1), whereas group is a between-person variable (Level 2).
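One way to verify this distinction directly (a sketch, assuming `d` is the data loaded above): a Level-2 variable is constant within each id, while a Level-1 variable varies within id.

```r
# group should take exactly one value per id; week should take several.
d %>%
  group_by(id) %>%
  summarize(groups_per_id = n_distinct(group),
            weeks_per_id  = n_distinct(week)) %>%
  summarize(max_groups = max(groups_per_id),  # expect 1 (between-person)
            max_weeks  = max(weeks_per_id))   # expect > 1 (within-person)
```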

Mathematical model with level-2 covariate

Likelihood function and linear model

\[\begin{align*} \text{con}_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{\text{id}[i]} + \beta_{\text{id}[i]}t_i \end{align*}\]

Varying intercepts and slopes with level-2 predictor:

\[\begin{align*} \alpha_{\text{id}[i]} &= \alpha + \gamma_\alpha G_{\text{id}[i]} + u_{\alpha,\text{id}[i]} \\ \beta_{\text{id}[i]} &= \beta + \gamma_\beta G_{\text{id}[i]} + u_{\beta,\text{id}[i]} \end{align*}\]

Random effects:

\[\begin{align*} \begin{bmatrix} u_{\alpha,\text{id}[i]} \\ u_{\beta,\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 \\ 0 & \sigma_{\beta}\end{pmatrix} \end{align*}\]

Priors: \[\begin{align*} \alpha &\sim \text{Normal}(0,1) \\ \beta &\sim \text{Normal}(0,1) \\ \gamma_\alpha &\sim \text{Normal}(0,1) \\ \gamma_\beta &\sim \text{Normal}(0,1) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]

m3 <- brm(
  data=d,
  family=gaussian,
  con ~ 1 + week + group + week:group + (1 + week | id),
  prior = c( prior( normal(0,1),    class=Intercept ),
             prior( normal(0,1),    class=b ),
             prior( exponential(1), class=sigma ),
             prior( exponential(1), class=sd ),
             prior( lkj(2),         class=cor)),
  iter=4000, warmup=1000, seed=9, cores=4, chains=4,
  file=here("files/models/71.3")
)

Divergent transitions! What do we do?
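Before re-specifying the model, one common knob to try (a sketch; the fix these notes settle on is centering, below) is a more cautious sampler via `adapt_delta`, which defaults to 0.8 in brms:

```r
# Refit with smaller NUTS steps; slower, but often removes a handful
# of divergences. (m3_tuned is a hypothetical object name.)
m3_tuned <- update(m3, control = list(adapt_delta = 0.99))
```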

m3
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: con ~ 1 + week + group + week:group + (1 + week | id) 
   Data: d (Number of observations: 216) 
  Draws: 4 chains, each with iter = 4000; warmup = 1000; thin = 1;
         total post-warmup draws = 12000

Multilevel Hyperparameters:
~id (Number of levels: 88) 
                    Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)           0.06      0.01     0.05     0.07 1.00     4017     6305
sd(week)                0.00      0.00     0.00     0.01 1.00     1909     2432
cor(Intercept,week)    -0.08      0.39    -0.75     0.72 1.00    12802     7985

Regression Coefficients:
             Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept        0.20      0.01     0.17     0.23 1.00     5915     7919
week             0.00      0.00    -0.01     0.01 1.00    10943     7070
groupPD         -0.01      0.02    -0.04     0.02 1.00     6071     7209
week:groupPD    -0.01      0.01    -0.02     0.01 1.00    11028     6548

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.05      0.00     0.04     0.05 1.00     3847     4795

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
as_draws_df(m3) %>% 
  ggplot( aes(x=b_Intercept, y=b_groupPD) ) +
    geom_point()

Center your variables:

d = d %>% 
  mutate(
    across( c(con, week),
            ~.- mean(., na.rm=T),
            .names="{col}_c"))

m3b <- brm(
  data=d,
  family=gaussian,
  con_c ~ 1 + week_c + group + week_c:group + (1 + week_c | id),
  prior = c( prior( normal(0,1),    class=Intercept ),
             prior( normal(0,1),    class=b ),
             prior( exponential(1), class=sigma ),
             prior( exponential(1), class=sd ),
             prior( lkj(2),         class=cor)),
  iter=4000, warmup=1000, seed=9, cores=4, chains=4,
  file=here("files/models/71.3b")
)
p1 = as_draws_df(m3) %>% 
  ggplot( aes(x=sd_id__week, 
              y=cor_id__Intercept__week) ) +
  geom_point()

p2 = as_draws_df(m3b) %>% 
  ggplot( aes(x=sd_id__week_c, y=cor_id__Intercept__week_c) ) +
  geom_point()

p1 + p2